A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data
Abstract
BACKGROUND: Genetically modified (GM) crops have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance, both during development and before transgenic plants are released into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms.

METHODS: To detect the transgene insertion locations in three GM rice genomes, Illumina sequencing reads are mapped to the rice genome and to the plasmid sequence, then classified by where they map. Reads that map to both are used to characterize the junction sites between plant and transgene sequence by sequence alignment.

RESULTS: Herein, we present a next-generation sequencing (NGS)-based molecular characterization method, using the transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences; these findings were equivalent to results obtained from Southern blot analyses.

CONCLUSION: NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate that combining NGS technology with bioinformatics approaches offers a cost- and time-effective method for assessing the safety of transgenic plants.
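The read-classification step described under METHODS can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the aligner, the thresholds, or the data layout. The sketch assumes the reads have already been aligned to a combined reference (rice genome plus plasmid) and parsed into simple records. The names `Alignment`, `junction_candidates`, `plasmid_ref`, `window`, and `min_support` are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """One aligned mate of a read pair (hypothetical, simplified record)."""
    read: str  # read-pair name shared by both mates
    mate: int  # 1 or 2
    ref: str   # reference sequence the mate mapped to
    pos: int   # 1-based leftmost mapping position


def junction_candidates(alignments, plasmid_ref="plasmid", window=500, min_support=2):
    """Cluster genome-side positions of read pairs that also hit the plasmid.

    Assumption: a pair with one mate on the plasmid and one on the genome
    spans a transgene/genome junction. `window` and `min_support` are
    illustrative defaults, not values from the paper.
    """
    by_read = defaultdict(list)
    for aln in alignments:
        by_read[aln.read].append(aln)

    # Collect the genome-side positions of pairs that map to both references.
    anchors = []
    for alns in by_read.values():
        refs = {a.ref for a in alns}
        if plasmid_ref in refs and len(refs) > 1:
            anchors.extend((a.ref, a.pos) for a in alns if a.ref != plasmid_ref)

    # Merge nearby anchors on the same chromosome into candidate regions.
    anchors.sort()
    clusters = []
    for chrom, pos in anchors:
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and pos - last["end"] <= window:
            last["end"] = pos
            last["support"] += 1
        else:
            clusters.append({"chrom": chrom, "start": pos, "end": pos, "support": 1})
    return [c for c in clusters if c["support"] >= min_support]


# Toy input: r1 and r2 bridge chr2 and the plasmid, r3 is an ordinary
# genome-only pair, and r4 is a lone chr7/plasmid pair below the threshold.
example = [
    Alignment("r1", 1, "chr2", 1000), Alignment("r1", 2, "plasmid", 50),
    Alignment("r2", 1, "plasmid", 120), Alignment("r2", 2, "chr2", 1180),
    Alignment("r3", 1, "chr2", 5000), Alignment("r3", 2, "chr2", 5300),
    Alignment("r4", 1, "chr7", 900), Alignment("r4", 2, "plasmid", 4000),
]
hits = junction_candidates(example)
```

In this toy input only the chr2 region is reported, because two independent pairs support it. A real analysis would additionally inspect split (soft-clipped) reads within each candidate region to resolve the junction to base-pair precision, which this sketch does not attempt.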
Similar resources
CONTRAILS: A tool for rapid identification of transgene integration sites in complex, repetitive genomes using low-coverage paired-end sequencing
Transgenic crops have become a staple in modern agriculture, and are typically characterized using a variety of molecular techniques involving proteomics and metabolomics. Characterization of the transgene insertion site is of great interest, as disruptions, deletions, and genomic location can affect product selection and fitness, and identification of these regions and their integrity is requi...
VirusSeq: software to identify viruses and their integration sites using next-generation sequencing of human cancer tissue
SUMMARY We developed a new algorithmic method, VirusSeq, for detecting known viruses and their integration sites in the human genome using next-generation sequencing data. We evaluated VirusSeq on whole-transcriptome sequencing (RNA-Seq) data of 256 human cancer samples from The Cancer Genome Atlas. Using these data, we showed that VirusSeq accurately detects the known viruses and their integra...
Identification of Genomic Insertion and Flanking Sequence of G2-EPSPS and GAT Transgenes in Soybean Using Whole Genome Sequencing Method
Molecular characterization of sequence flanking exogenous fragment insertion is essential for safety assessment and labeling of genetically modified organism (GMO). In this study, the T-DNA insertion sites and flanking sequences were identified in two newly developed transgenic glyphosate-tolerant soybeans GE-J16 and ZH10-6 based on whole genome sequencing (WGS) method. More than 22.4 Gb sequen...
Identifying insertion mutations by whole-genome sequencing.
Insertion mutagenesis via mobile genetic element is a common technique for the analysis of gene function in model organisms. Next-generation sequencing offers an attractive approach for localizing the site of insertion, but alignment-based mapping of mobile genetic elements is challenging. A computational method for identifying insertion sites is reported herein. The technique was validated by ...
HideNseek, a post-genome approach to locate transgenes exemplified in Arabidopsis thaliana
SUMMARY Determination of transgene location is essential for investigating the effects of position on transgene expression levels and facilitates cloning of the resident gene affected by insertion. Currently used PCR-based approaches for determination of transgene location are relatively complicated and often fail when the transgene is duplicated, rearranged or fragmented. HideNseek is a new bi...